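Individual Treatment Effect (ITE) prediction is an important area of machine learning research that aims to explain and estimate the causal impact of an action at a granular level. It is a problem of growing interest in multiple application areas such as healthcare, online advertising, and socio-economics. To foster research on this topic, we release a publicly available collection of 13.9 million samples gathered from several randomized control trials, scaling up previously available datasets by a healthy factor of 210. We provide details on the data collection and perform sanity checks to validate the use of this data for causal inference tasks. First, we formalize the task of uplift modeling (UM) that can be performed with this data, along with the relevant evaluation metrics. Then, we propose synthetic response surfaces and heterogeneous treatment assignment, which provide a general setup for ITE prediction. Finally, we report experiments that validate key characteristics of the dataset, leveraging its size to evaluate and compare, with high statistical significance, a selection of baseline UM and ITE prediction methods.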
translated by 谷歌翻译
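As a rough illustration of the uplift modeling task named in the abstract above, the sketch below estimates the average uplift from randomized trial data as the difference between the mean outcome of treated and control samples. The function name `average_uplift`, the record layout, and the toy data are assumptions made for illustration; they are not taken from the released dataset or the paper's evaluation code.

```python
from statistics import mean


def average_uplift(records):
    """Estimate average uplift from (treated, outcome) pairs.

    Assumes treatment was assigned at random, so the difference in
    mean outcomes is an unbiased estimate of the average effect.
    """
    treated = [y for t, y in records if t]
    control = [y for t, y in records if not t]
    if not treated or not control:
        raise ValueError("need at least one treated and one control sample")
    return mean(treated) - mean(control)


# Hypothetical toy data: (treated?, binary outcome such as a conversion).
toy = [
    (True, 1), (True, 0), (True, 1), (True, 1),
    (False, 0), (False, 1), (False, 0), (False, 0),
]
print(average_uplift(toy))  # 0.75 - 0.25 = 0.5
```

An ITE or uplift model goes further by predicting this difference per individual from features; the population-level estimate here is only the simplest baseline.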
The performance of Deep Learning (DL) models depends on the quality of labels. In some areas, the involvement of human annotators may lead to noise in the data. When these corrupted labels are blindly regarded as the ground truth (GT), DL models suffer from degraded performance. This paper presents a method that aims to learn a confident model in the presence of noisy labels. This is done in conjunction with estimating the uncertainty of multiple annotators. We robustly estimate the predictions given only the noisy labels by adding an entropy- or information-based regularizer to the classifier network. We conduct our experiments on noisy versions of the MNIST, CIFAR-10, and FMNIST datasets. Our empirical results demonstrate the robustness of our method, as it outperforms or performs comparably to other state-of-the-art (SOTA) methods. In addition, we evaluate the proposed method on a curated dataset in which the noise type and level of the various annotators depend on the input image style. We show that our approach performs well and is adept at learning annotators' confusion. Moreover, we demonstrate that our model is more confident in predicting the GT than other baselines. Finally, we assess our approach on a segmentation problem and showcase its effectiveness with experiments.
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about common practice or about the bottlenecks the community faces in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, and algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants (32%) stated that they did not have enough time for method development. 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
This report summarises the outcomes of a systematic literature search to identify Bayesian network models used to support decision making in healthcare. After describing the search methodology, the selected research papers are briefly reviewed, with a view to identifying publicly available models and datasets that are well suited to analysis using the causal interventional analysis software tool developed in Wang B, Lyle C, Kwiatkowska M (2021). Finally, an experimental evaluation of applying the software to a selection of models is carried out and preliminary results are reported.
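The proposed model for automatic clinical caption generation combines the analysis of frontal X-ray scans with structured patient information from radiology records. We combine two language models, Show-Attend-Tell and GPT-3, to generate comprehensive and descriptive radiology records. The proposed combination of these models produces a textual summary containing information about the pathologies found and their location, together with 2D heatmaps that localize each pathology on the original X-ray scan. The proposed model is tested on two medical datasets, Open-I and MIMIC-CXR, and on the general-purpose MS-COCO. The results, measured with natural language evaluation metrics, demonstrate its effective applicability to chest X-ray image captioning.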
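We present Sauron, a filter pruning method that eliminates redundant feature maps by discarding the corresponding filters with automatically adjusted layer-specific thresholds. Furthermore, Sauron minimizes a regularization term that, as we show with various metrics, promotes the formation of feature map clusters. In contrast to most filter pruning methods, Sauron is single-phase, similar to typical neural network optimization, and requires fewer hyperparameters and design decisions. Additionally, unlike other cluster-based approaches, our method does not require pre-selecting the number of clusters, which is non-trivial to determine and varies across layers. We evaluated Sauron and three state-of-the-art filter pruning methods on three medical image segmentation tasks. This is an area where filter pruning has received little attention and where it can help build efficient models for medical-grade computers that cannot use cloud services due to privacy considerations. Sauron achieved models with higher performance and pruning rate than the competing pruning methods. Additionally, since Sauron removes filters during training, its optimization accelerated over time. Finally, we show that the feature maps of a Sauron-pruned model were highly interpretable. The Sauron code is publicly available at https://github.com/jmlipman/sauronunet.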
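Homophily is a graph property describing the tendency of edges to connect similar nodes; the opposite is called heterophily. While homophily is natural for many real-world networks, there are also networks without this property. It is often believed that standard message-passing graph neural networks (GNNs) do not perform well on non-homophilous graphs, and that such datasets therefore need special attention. While a lot of effort has been put into developing graph representation learning methods for heterophilous graphs, there is no universally agreed-upon measure of homophily. Several metrics for measuring homophily have been used in the literature; however, we show that all of them have critical drawbacks that prevent comparison of homophily levels between different datasets. We formalize desirable properties for a proper homophily measure and show how the existing literature on the properties of classification performance metrics can be linked to our problem. In doing so, we find a measure that we call adjusted homophily, which satisfies more of the desirable properties than existing homophily measures. Interestingly, this measure is related to two classification performance metrics: Cohen's Kappa and the Matthews correlation coefficient. We then go beyond the homophily-heterophily dichotomy and propose a new property that we call label informativeness (LI), which characterizes how much information a neighbor's label provides about a node's label. We show theoretically that LI is comparable across datasets with different numbers of classes and class size balance. Through a series of experiments, we show that LI is a better predictor of GNN performance on a dataset than homophily. We show that LI explains why GNNs can sometimes perform well on heterophilous datasets, a phenomenon recently observed in the literature.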
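To make the notion of homophily above concrete, here is a minimal sketch of edge homophily, a standard measure defined as the fraction of edges whose endpoints share a label. It is not the adjusted homophily or label informativeness proposed in the abstract, whose definitions are not given here. The function name `edge_homophily` and the toy graph are assumptions chosen for illustration.

```python
def edge_homophily(edges, labels):
    """Fraction of edges that connect two nodes with the same label.

    edges:  list of (u, v) node pairs (undirected, each edge listed once)
    labels: mapping from node to class label
    """
    if not edges:
        raise ValueError("graph has no edges")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Hypothetical 4-node cycle: two nodes of class "a", two of class "b".
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
labels = {0: "a", 1: "a", 2: "b", 3: "b"}
h = edge_homophily(edges, labels)
print(h)  # edges (0, 1) and (2, 3) are same-label, so 2 / 4 = 0.5
```

Because this raw fraction depends on the number of classes and their balance, a value such as 0.5 is hard to compare across datasets, which is the kind of drawback the abstract points to.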
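Many NLP tasks benefit from using large language models (LLMs), which often have more than 100 billion parameters. With the release of BLOOM-176B and OPT-175B, everyone can download pretrained models of this scale. Still, using these models requires high-end hardware that is unavailable to many researchers. In some cases, LLMs can be used more affordably via RAM offloading or hosted APIs. However, these techniques have innate limitations: offloading is too slow for interactive inference, while APIs are not flexible enough for research. In this work, we propose Petals, a system for inference and fine-tuning of large models that joins the resources of multiple parties trusted to process the client's data. We demonstrate that this strategy significantly outperforms offloading for very large models, running at approximately 1 step per second. Unlike most inference APIs, Petals also natively exposes the hidden states of served models, allowing its users to train and share custom model extensions based on efficient fine-tuning methods.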
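Accurate prediction of crop yield before harvest is of great importance for crop logistics, market planning, and food distribution around the world. Yield prediction requires monitoring phenological and climatic characteristics over extended periods of time in order to model the complex relationships involved in crop development. Remote sensing imagery provided by the various satellites orbiting the Earth is a cheap and reliable way to obtain data for yield prediction. The field of yield prediction is currently dominated by deep learning approaches. While the accuracies reached with these approaches are promising, the amount of data they need and their "black-box" nature can restrict their application. These limitations can be overcome with a pipeline that processes remote sensing images into feature-based representations, which allow Extreme Gradient Boosting (XGBoost) to be used for yield prediction. A comparative evaluation of soybean yield prediction in the United States shows promising prediction accuracy compared to state-of-the-art yield prediction systems based on deep learning. Feature importances reveal the near-infrared spectrum to be an important feature in our models. The reported results hint at the capabilities of XGBoost for yield prediction and encourage future experiments with XGBoost on yield prediction for other crops in regions around the world.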
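The purpose of change point detection algorithms is to locate abrupt changes in the time evolution of a process. In this paper, we introduce an application of latent neural stochastic differential equations to the change point detection problem. We demonstrate the detection capabilities and performance of our model on a range of synthetic and real-world datasets and benchmarks. Most of the studied scenarios show that the proposed algorithm outperforms state-of-the-art algorithms. We also discuss the strengths and limitations of this approach and indicate directions for further improvement.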